Computable and Faithful Lower Bound for Entanglement Cost
Quantum entanglement is a crucial resource in quantum information processing.
However, quantifying the entanglement required to prepare quantum states and
implement quantum processes remains challenging. This paper proposes computable
and faithful lower bounds for the entanglement cost of general quantum states
and quantum channels. We introduce the concept of logarithmic k-negativity, a
generalization of logarithmic negativity, to establish a general lower bound
for the entanglement cost of quantum states under quantum operations that
completely preserve the positivity of partial transpose (PPT). This bound is
efficiently computable via semidefinite programming and is non-zero for any
entangled state that is not PPT, making it faithful in the entanglement theory
with non-positive partial transpose. Furthermore, we delve into specific and
general examples to demonstrate the advantages of our proposed bounds compared
with previously known computable ones. Notably, we affirm the irreversibility
of asymptotic entanglement manipulation under PPT operations for full-rank
entangled states and the irreversibility of channel manipulation for amplitude
damping channels. We also establish the best-known lower bound for the
entanglement cost of arbitrary dimensional isotropic states. These findings
push the boundaries of understanding the structure of entanglement and the
fundamental limits of entanglement manipulation.
Comment: 25 pages
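For intuition, the standard logarithmic negativity that the paper generalizes is already computable from the partial transpose, E_N(rho) = log2 ||rho^{T_B}||_1. Below is a minimal numpy sketch of that baseline quantity on a two-qubit isotropic state; it illustrates the quantity being generalized, not the paper's k-negativity bound or its SDP.

```python
import numpy as np

def partial_transpose(rho, dA, dB):
    """Partial transpose over subsystem B of a (dA*dB) x (dA*dB) density matrix."""
    r = rho.reshape(dA, dB, dA, dB)          # indices (a, b, a', b')
    return r.transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

def log_negativity(rho, dA, dB):
    """E_N(rho) = log2 ||rho^{T_B}||_1; the trace norm is the sum of |eigenvalues|
    because the partial transpose of a Hermitian matrix is Hermitian."""
    pt = partial_transpose(rho, dA, dB)
    return np.log2(np.abs(np.linalg.eigvalsh(pt)).sum())

# Two-qubit isotropic state: rho = p |Phi+><Phi+| + (1 - p) I/4
phi = np.zeros(4)
phi[0] = phi[3] = 1 / np.sqrt(2)
p = 0.9
rho = p * np.outer(phi, phi) + (1 - p) * np.eye(4) / 4
print(log_negativity(rho, 2, 2))  # > 0, so the state is NPT (hence entangled)
```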
HFORD: High-Fidelity and Occlusion-Robust De-identification for Face Privacy Protection
With the popularity of smart devices and the development of computer vision
technology, concerns about face privacy protection are growing. The face
de-identification technique is a practical way to solve the identity protection
problem. Existing facial de-identification methods suffer from several
problems, including degraded realism of anonymized results in the presence of
occlusions and an inability to preserve identity-irrelevant details in
anonymized results. We present a High-Fidelity and Occlusion-Robust
De-identification (HFORD) method to deal with these issues. This approach can
disentangle identities and attributes while preserving image-specific details
such as background, facial features (e.g., wrinkles), and lighting, even in
occluded scenes. To disentangle the latent codes in the GAN inversion space, we
introduce an Identity Disentanglement Module (IDM). This module selects the
latent codes that are closely related to the identity. It further separates the
latent codes into identity-related codes and attribute-related codes, enabling
the network to preserve attributes while only modifying the identity. To ensure
the preservation of image details and enhance the network's robustness to
occlusions, we propose an Attribute Retention Module (ARM). This module
adaptively preserves identity-irrelevant details and facial occlusions and
blends them into the generated results in a modulated manner. Extensive
experiments show that our method has higher quality, better detail fidelity,
and stronger occlusion robustness than other face de-identification methods.
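The abstract does not spell out the architecture, but the core idea of the IDM, separating GAN-inversion latent codes into identity-related and attribute-related parts and modifying only the former, can be sketched as follows. The W+ dimensions and the fixed layer split are hypothetical stand-ins for what HFORD selects with a learned module.

```python
import torch

# Hypothetical sketch of the IDM idea in a StyleGAN-style W+ space
# (18 layers x 512 dims); HFORD learns which codes are identity-related,
# whereas a fixed layer range is assumed here purely for illustration.
N_LAYERS, DIM = 18, 512
id_layers = torch.zeros(N_LAYERS, dtype=torch.bool)
id_layers[3:8] = True  # assumed identity-related layers

def de_identify(w_source: torch.Tensor, w_virtual: torch.Tensor) -> torch.Tensor:
    """Swap in a virtual identity while keeping attribute-related codes
    (background, wrinkles, lighting) from the source image."""
    w_out = w_source.clone()
    w_out[:, id_layers] = w_virtual[:, id_layers]
    return w_out

w_src = torch.randn(1, N_LAYERS, DIM)   # inverted latent of the input face
w_virt = torch.randn(1, N_LAYERS, DIM)  # latent encoding an anonymous identity
w_anon = de_identify(w_src, w_virt)     # would be fed to the GAN generator
```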
Statistical Analysis of Quantum State Learning Process in Quantum Neural Networks
Quantum neural networks (QNNs) have been a promising framework in pursuing
near-term quantum advantage in various fields, where many applications can be
viewed as learning a quantum state that encodes useful data. As a quantum
analog of probability distribution learning, quantum state learning is
theoretically and practically essential in quantum machine learning. In this
paper, we develop a no-go theorem for learning an unknown quantum state with
QNNs even starting from a high-fidelity initial state. We prove that when the
loss value is lower than a critical threshold, the probability of avoiding
local minima vanishes exponentially with the qubit count, while growing only
polynomially with the circuit depth. The curvature of local minima concentrates
around the quantum Fisher information times a loss-dependent constant, which
characterizes the sensitivity of the output state with respect to the
parameters in QNNs. These results hold for any circuit structure and
initialization strategy, and apply to both fixed ansatzes and adaptive
methods. Extensive numerical simulations are performed to validate our
theoretical results. Our findings place generic limits on good initial guesses
and adaptive methods for improving the learnability and scalability of QNNs,
and deepen the understanding of prior information's role in QNNs.
Comment: 28 pages including appendix. To appear at NeurIPS 2023
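As a toy illustration of the curvature statement (not the paper's general proof), consider a single-parameter state |psi(t)> = RY(t)|0> with a fidelity loss. The numerically estimated Hessian at the global minimum equals the quantum Fisher information (QFI = 1 for this generator) times a loss-dependent factor, here 1/2 at zero loss.

```python
import numpy as np

# Toy check: |psi(t)> = RY(t)|0>, loss L(t) = 1 - |<phi|psi(t)>|^2.
# The Hessian at the minimum equals QFI/2, matching a curvature set by the
# quantum Fisher information times a loss-dependent constant.
def psi(t):
    return np.array([np.cos(t / 2), np.sin(t / 2)])

t_star = 0.7                      # target parameter
phi = psi(t_star)                 # target state

def loss(t):
    return 1 - abs(phi @ psi(t)) ** 2

eps = 1e-4
hess = (loss(t_star + eps) - 2 * loss(t_star) + loss(t_star - eps)) / eps**2
qfi = 1.0                         # QFI of RY rotations acting on |0> is 1
print(hess, qfi / 2)              # both ~0.5
```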
Diff-Privacy: Diffusion-based Face Privacy Protection
Privacy protection has become a top priority as the proliferation of AI
techniques has led to widespread collection and misuse of personal data.
Anonymization and visual identity information hiding are two important facial
privacy protection tasks that aim to remove identification characteristics from
facial images at the human perception level. However, they differ
significantly: the former aims to prevent machines from recognizing the
identity correctly, while the latter must preserve the accuracy of machine
recognition. Consequently, it is difficult to train one model to complete these two
tasks simultaneously. In this paper, we unify the task of anonymization and
visual identity information hiding and propose a novel face privacy protection
method based on diffusion models, dubbed Diff-Privacy. Specifically, we train
our proposed multi-scale image inversion module (MSI) to obtain a set of
SDM-format conditional embeddings of the original image. Based on the conditional
embeddings, we design corresponding embedding scheduling strategies and
construct different energy functions during the denoising process to achieve
anonymization and visual identity information hiding. Extensive experiments
have been conducted to validate the effectiveness of our proposed framework in
protecting facial privacy.
Comment: 17 pages
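The abstract's mechanism of steering denoising with energy functions belongs to the family of guided sampling. Below is a generic, heavily simplified sketch of one energy-guided DDIM step; the function names, schedule handling, and placeholder energy are assumptions for illustration, not Diff-Privacy's exact formulation.

```python
import torch

def guided_ddim_step(model, x_t, t, cond, energy_fn, scale, alpha_t, alpha_prev):
    """One DDIM step where the noise prediction is shifted by the gradient of a
    task energy (e.g. an identity-distance energy for anonymization)."""
    x_t = x_t.detach().requires_grad_(True)
    eps = model(x_t, t, cond)                         # conditional noise prediction
    grad = torch.autograd.grad(energy_fn(x_t).sum(), x_t)[0]
    eps = eps + scale * (1 - alpha_t).sqrt() * grad   # steer toward low energy
    x0 = (x_t - (1 - alpha_t).sqrt() * eps) / alpha_t.sqrt()
    return (alpha_prev.sqrt() * x0 + (1 - alpha_prev).sqrt() * eps).detach()

# Dummy stand-ins just to exercise the step
model = lambda x, t, c: torch.zeros_like(x)
energy = lambda x: (x ** 2).flatten(1).mean(dim=1)    # placeholder energy
x = torch.randn(1, 3, 64, 64)
a_t, a_prev = torch.tensor(0.5), torch.tensor(0.6)
x_prev = guided_ddim_step(model, x, 10, None, energy, 1.0, a_t, a_prev)
```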
CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization
We propose CatVersion, an inversion-based method that learns the personalized
concept through a handful of examples. Subsequently, users can utilize text
prompts to generate images that embody the personalized concept, thereby
achieving text-to-image personalization. In contrast to existing approaches
that emphasize word embedding learning or parameter fine-tuning for the
diffusion model, which potentially causes concept dilution or overfitting, our
method concatenates embeddings on the feature-dense space of the text encoder
in the diffusion model to learn the gap between the personalized concept and
its base class, aiming to maximize the preservation of prior knowledge in
diffusion models while restoring the personalized concepts. To this end, we
first dissect the text encoder's integration in the image generation process to
identify the feature-dense space of the encoder. Afterward, we concatenate
embeddings on the Keys and Values in this space to learn the gap between the
personalized concept and its base class. In this way, the concatenated
embeddings ultimately manifest as a residual on the original attention output.
To quantify the results of personalized image generation more accurately and
without bias, we improve the CLIP image alignment score using masks.
Qualitatively and quantitatively, CatVersion helps to restore personalization
concepts more faithfully and enables more robust editing.
Comment: For the project page, please visit
https://royzhao926.github.io/CatVersion-page
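The central mechanism, learnable embeddings concatenated to the Keys and Values of attention so they act as a residual on the attention output, can be sketched as below; the dimensions, zero initialization, and single-head layout are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

class ConcatKVAttention(torch.nn.Module):
    """Attention where a few learned rows are concatenated to K and V; only
    these rows are trained, so they manifest as a residual on the output."""
    def __init__(self, dim=768, n_new=4):
        super().__init__()
        self.k_new = torch.nn.Parameter(torch.zeros(n_new, dim))
        self.v_new = torch.nn.Parameter(torch.zeros(n_new, dim))

    def forward(self, q, k, v):
        b = k.size(0)
        k = torch.cat([k, self.k_new.expand(b, -1, -1)], dim=1)
        v = torch.cat([v, self.v_new.expand(b, -1, -1)], dim=1)
        attn = F.softmax(q @ k.transpose(1, 2) / k.size(-1) ** 0.5, dim=-1)
        return attn @ v

q = k = v = torch.randn(2, 77, 768)   # e.g. text-encoder token features
out = ConcatKVAttention()(q, k, v)    # only k_new / v_new receive gradients
```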
All-to-key Attention for Arbitrary Style Transfer
Attention-based arbitrary style transfer studies have shown promising
performance in synthesizing vivid local style details. They typically use the
all-to-all attention mechanism -- each position of content features is fully
matched to all positions of style features. However, all-to-all attention tends
to generate distorted style patterns and has quadratic complexity, limiting the
effectiveness and efficiency of arbitrary style transfer. In this paper, we
propose a novel all-to-key attention mechanism -- each position of content
features is matched to stable key positions of style features -- that is more
in line with the characteristics of style transfer. Specifically, it integrates
two newly proposed attention forms: distributed and progressive attention.
Distributed attention assigns attention to key style representations that
depict the style distribution of local regions; progressive attention attends
from coarse-grained regions to fine-grained key positions. The
resultant module, dubbed StyA2K, shows extraordinary performance in preserving
the semantic structure and rendering consistent style patterns. Qualitative and
quantitative comparisons with state-of-the-art methods demonstrate the superior
performance of our approach.
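To make the complexity argument concrete, here is a minimal sketch of the all-to-key idea: queries from every content position attend to a small, fixed set of key style positions rather than to all style positions. Average pooling stands in for the paper's distributed/progressive selection of key positions, so this is an illustration of the cost reduction, not StyA2K itself.

```python
import torch
import torch.nn.functional as F

def all_to_key_attention(content, style, key_grid=8):
    """Content queries attend to key_grid**2 pooled style positions instead of
    all H*W positions, avoiding the quadratic all-to-all cost."""
    B, C, H, W = content.shape
    keys = F.adaptive_avg_pool2d(style, key_grid)           # B x C x 8 x 8
    q = content.flatten(2).transpose(1, 2)                  # B x HW x C
    k = v = keys.flatten(2).transpose(1, 2)                 # B x 64 x C
    attn = F.softmax(q @ k.transpose(1, 2) / C ** 0.5, dim=-1)
    return (attn @ v).transpose(1, 2).reshape(B, C, H, W)

content = torch.randn(1, 64, 32, 32)
style = torch.randn(1, 64, 32, 32)
stylized = all_to_key_attention(content, style)  # cost is HW x 64, not HW x HW
```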
Enhancement of non-Stabilizerness within Indefinite Causal Order
In the field of quantum computation, the non-stabilizerness of a quantum
circuit is crucial for understanding and quantifying quantum speed-up. In this
work, we explore some intriguing phenomena regarding the non-stabilizerness of
a circuit when a Quantum SWITCH structure is employed. This structure is a
novel quantum construct that enables quantum states to pass through operations
in a superposition of different orders and has shown superiority in numerous
tasks over circuits with a definite causal order. First, we discover that
completely stabilizer-preserving operations, which cannot generate magic states
under standard conditions, can be transformed into resourceful operations
capable of generating magic states when processed by the Quantum SWITCH.
Secondly, when considering the effects of noisy channels on operations, we
observe that while the non-stabilizerness of each path may be annihilated,
their superposition could still preserve the non-stabilizerness of the
operation. These findings reveal unique properties brought by the Quantum
SWITCH and open further avenues in future research on magic resources of
general quantum architectures.
Comment: 5+4 pages, 4 figures
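For reference, the Quantum SWITCH of two channels with Kraus operators {A_i} and {B_j} has Kraus operators W_ij = A_i B_j ⊗ |0><0|_c + B_j A_i ⊗ |1><1|_c, with a control qubit prepared in |+> superposing the two orders. A minimal numpy sketch follows; the depolarizing-channel example is a standard illustration of the SWITCH, not taken from the paper.

```python
import numpy as np

def quantum_switch(rho, kraus_a, kraus_b):
    """Output of the Quantum SWITCH on input rho with control in |+>."""
    d = rho.shape[0]
    P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
    plus = np.full((2, 2), 0.5)                      # |+><+| control state
    rho_in = np.kron(rho, plus)
    out = np.zeros((2 * d, 2 * d), dtype=complex)
    for A in kraus_a:
        for B in kraus_b:
            W = np.kron(A @ B, P0) + np.kron(B @ A, P1)
            out += W @ rho_in @ W.conj().T
    return out                                       # joint target-control state

# Example: two fully depolarizing qubit channels (Kraus operators Pauli/2)
I2 = np.eye(2); X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]]); Z = np.diag([1.0, -1.0])
paulis = [P / 2 for P in (I2, X, Y, Z)]
rho = np.diag([1.0, 0.0])                            # input |0><0|
print(np.round(quantum_switch(rho, paulis, paulis), 3))
```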
VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
We present VideoReTalking, a new system to edit the faces of a real-world
talking head video according to input audio, producing a high-quality and
lip-syncing output video even with a different emotion. Our system disentangles
this objective into three sequential tasks: (1) face video generation with a
canonical expression; (2) audio-driven lip-sync; and (3) face enhancement for
improving photo-realism. Given a talking-head video, we first modify the
expression of each frame according to the same expression template using the
expression editing network, resulting in a video with the canonical expression.
This video, together with the given audio, is then fed into the lip-sync
network to generate a lip-syncing video. Finally, we improve the photo-realism
of the synthesized faces through an identity-aware face enhancement network and
post-processing. We use learning-based approaches for all three steps, and all
modules run in a sequential pipeline without any user intervention.
Furthermore, our system is a generic approach that does not need
to be retrained to a specific person. Evaluations on two widely-used datasets
and in-the-wild examples demonstrate the superiority of our framework over
other state-of-the-art methods in terms of lip-sync accuracy and visual
quality.
Comment: Accepted by SIGGRAPH Asia 2022 Conference Proceedings. Project page:
https://vinthony.github.io/video-retalking
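The three-stage design reduces to a straightforward sequential composition. A schematic sketch with placeholder callables is shown below; all names are hypothetical stand-ins for the expression-editing, lip-sync, and identity-aware enhancement networks.

```python
# Schematic of the three sequential stages; the callables are placeholders.
def retalk(frames, audio, expression_net, lipsync_net, enhance_net):
    canonical = [expression_net(f) for f in frames]   # (1) canonical expression
    synced = lipsync_net(canonical, audio)            # (2) audio-driven lip-sync
    return [enhance_net(f) for f in synced]           # (3) photo-realism enhancement

# Dummy stand-ins to exercise the pipeline
out = retalk(["frame0", "frame1"], "audio",
             lambda f: f + "+canonical",
             lambda fs, a: [f + "+lipsync" for f in fs],
             lambda f: f + "+enhanced")
print(out)
```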